 Los Gatos





Multi-Objective Intrinsic Reward Learning for Conversational Recommender Systems

Neural Information Processing Systems

Conversational Recommender Systems (CRS) actively elicit user preferences to generate adaptive recommendations. Mainstream reinforcement learning-based CRS solutions heavily rely on handcrafted reward functions, which may not be aligned with user intent in CRS tasks.


Does Weighting Improve Matrix Factorization for Recommender Systems?

Ayoub, Alex, Robertson, Samuel, Liang, Dawen, Steck, Harald, Kallus, Nathan

arXiv.org Machine Learning

Matrix factorization is a widely used approach for top-N recommendation and collaborative filtering. When implemented on implicit feedback data (such as clicks), a common heuristic is to upweight the observed interactions. This strategy has been shown to improve performance for certain algorithms. In this paper, we conduct a systematic study of various weighting schemes and matrix factorization algorithms. Somewhat surprisingly, we find that training with unweighted data can perform comparably to, and sometimes outperform, training with weighted data, especially for large models. This observation challenges the conventional wisdom. Nevertheless, we identify cases where weighting can be beneficial, particularly for models with lower capacity and specific regularization schemes. We also derive efficient algorithms for exactly minimizing several weighted objectives that were previously considered computationally intractable. Our work provides a comprehensive analysis of the interplay between weighting, regularization, and model capacity in matrix factorization for recommender systems.
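The weighting heuristic mentioned in the abstract can be illustrated with a small sketch. It is not taken from the paper: it assumes a common weighted squared-loss formulation in which each observed interaction gets weight `1 + alpha` and each unobserved one gets weight 1, with an L2 penalty `lam` on the user factors. The function name `weighted_user_step` and the parameters `alpha` and `lam` are hypothetical. Setting `alpha = 0` recovers the unweighted objective, which then has a single closed-form solution shared by all users.

```python
import numpy as np

def weighted_user_step(X, V, alpha, lam):
    """One alternating-least-squares update of the user factors.

    Assumed (illustrative) objective, per user u:
        sum_i w_ui * (x_ui - u_u . v_i)^2 + lam * ||u_u||^2
    with w_ui = 1 + alpha * x_ui, so observed entries (x_ui = 1) are upweighted.
    """
    n_users, k = X.shape[0], V.shape[1]
    U = np.zeros((n_users, k))
    for u in range(n_users):
        w = 1.0 + alpha * X[u]                 # per-item weights for this user
        VW = V * w[:, None]                    # rows of V scaled by their weights
        A = VW.T @ V + lam * np.eye(k)         # weighted normal-equation matrix
        b = VW.T @ X[u]                        # weighted right-hand side
        U[u] = np.linalg.solve(A, b)
    return U

# Toy binary implicit-feedback matrix and random item factors (made-up data).
rng = np.random.default_rng(0)
X = (rng.random((5, 7)) < 0.3).astype(float)
V = rng.normal(size=(7, 3))

U_unweighted = weighted_user_step(X, V, alpha=0.0, lam=0.1)
U_weighted = weighted_user_step(X, V, alpha=10.0, lam=0.1)
```

In the unweighted case the matrix `A` is the same for every user, so one factorization serves all rows. With weights, each user needs a separate solve, which is one reason weighted objectives are usually more expensive to minimize.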


Orchestrating Human-AI Teams: The Manager Agent as a Unifying Research Challenge

Masters, Charlie, Vellanki, Advaith, Shangguan, Jiangbo, Kultys, Bart, Gilmore, Jonathan, Moore, Alastair, Albrecht, Stefano V.

arXiv.org Artificial Intelligence

While agentic AI has advanced in automating individual tasks, managing complex multi-agent workflows remains a challenging problem. This paper presents a research vision for autonomous agentic systems that orchestrate collaboration within dynamic human-AI teams. We propose the Autonomous Manager Agent as a core challenge: an agent that decomposes complex goals into task graphs, allocates tasks to human and AI workers, monitors progress, adapts to changing conditions, and maintains transparent stakeholder communication. We formalize workflow management as a Partially Observable Stochastic Game and identify four foundational challenges: (1) compositional reasoning for hierarchical decomposition, (2) multi-objective optimization under shifting preferences, (3) coordination and planning in ad hoc teams, and (4) governance and compliance by design. To advance this agenda, we release MA-Gym, an open-source simulation and evaluation framework for multi-agent workflow orchestration. Evaluating GPT-5-based Manager Agents across 20 workflows, we find they struggle to jointly optimize for goal completion, constraint adherence, and workflow runtime, underscoring workflow management as a difficult open problem. We conclude with organizational and ethical implications of autonomous management systems.



Correctness-Guaranteed Code Generation via Constrained Decoding

Li, Lingxiao, Rahili, Salar, Zhao, Yiwei

arXiv.org Artificial Intelligence

Language Models (LMs) are increasingly being used for code generation, but ensuring the correctness of generated programs remains a significant challenge. Although imperfect code may be acceptable during software development with human oversight, domains such as video games and robotics require one-shot correctness for runtime-critical components. We present a constrained decoding algorithm for generating semantically correct programs that incorporates a context-sensitive parser, which, at each step, outputs a regular expression that satisfies a critical non-extensible property to guide the generation of the next token sequence that can continue to a correct program. To build such a context-sensitive parser, we propose a framework of a dynamic tree of parsers (ToP) during parsing, where each parser corresponds to a modular context-free grammar enriched with contextual information such as variable scopes and type constraints, with tree branches representing ambiguity in the future code segment. We demonstrate our approach through sLua, a strongly typed variant of Lua, showing that our method can generate semantically correct programs conforming to any prescribed scripting API. We further show that, with careful design, our semantic guarantees extend to runtime correctness, as validated in the application of generating game mechanics for a roguelike video game.
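The general idea of constrained decoding can be shown with a toy sketch. This is not the paper's tree-of-parsers algorithm and does not involve sLua: it assumes a made-up one-statement language of the form `x=12;`, a fixed regular expression for valid prefixes, and a hypothetical scoring function standing in for a language model. At each step, only tokens that keep the output a valid prefix are eligible, and the highest-scoring eligible token is chosen.

```python
import re

# Assumed toy grammar: a single lowercase variable, '=', digits, then ';'.
VALID_PREFIX = re.compile(r"([a-z](=([0-9]+;?)?)?)?")  # every prefix of a program
COMPLETE = re.compile(r"[a-z]=[0-9]+;")                 # a finished program

def constrained_greedy_decode(score_fn, vocab, max_steps=10):
    """Greedy decoding restricted to tokens that keep the prefix valid."""
    out = ""
    for _ in range(max_steps):
        # Mask out every token that would make the output an invalid prefix.
        allowed = [t for t in vocab if VALID_PREFIX.fullmatch(out + t)]
        if not allowed:
            break
        out += max(allowed, key=lambda t: score_fn(out, t))
        if COMPLETE.fullmatch(out):
            return out
    return None  # no complete program within the step budget

# Hypothetical "model" that prefers tokens which are often invalid in context.
SCORES = {"==": 5.0, ";": 4.0, "x": 3.0, "=": 2.0, "1": 1.0, "2": 0.5}
VOCAB = ["x", "y", "=", "==", "1", "2", ";", "if"]

def toy_score(prefix, token):
    return SCORES.get(token, 0.0)

program = constrained_greedy_decode(toy_score, VOCAB)
print(program)
```

Without the mask, greedy decoding would begin with `==`, which cannot start a valid statement. In this toy grammar every valid prefix can be extended to a complete program, so masking alone suffices; handling context such as variable scopes and types is the harder problem the abstract describes.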


Supplement to 'Autoencoders that don't overfit towards the Identity'

Neural Information Processing Systems

We start from Eq. 1 in the paper (re-stated in Eq. 2 below), and show that it is equal to the objective function in the Theorem in the paper (see Eq. 8 below) up to a factor. In the following, we provide the detailed steps. We start by re-stating Eq. 1 in the paper. The details are outlined in Sections 2.2 and 2.3 below. See Eq. 1 above for the definitions of X, multiplied by the dropout-probability p, and q = 1 - p. In line 6, we change the sum over the m columns back to matrix notation. Finally, in line 8, we used the substitutions from Eq. 1. In lines 11 and 12, the squared loss is expanded into its four terms.